<html>
<head><meta charset="utf-8"><title>0.2.x api is a Selection sort · t-cargo/PubGrub · Zulip Chat Archive</title></head>
<h2>Stream: <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/index.html">t-cargo/PubGrub</a></h2>
<h3>Topic: <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html">0.2.x api is a Selection sort</a></h3>

<hr>

<base href="https://rust-lang.zulipchat.com">

<head><link href="https://rust-lang.github.io/zulip_archive/style.css" rel="stylesheet"></head>

<a name="237973002"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/237973002" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Eh2406 <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#237973002">(May 08 2021 at 17:47)</a>:</h4>
<p>I was continuing the experiments with the Cargo-&gt;Pubgrub script. It is a lot faster with the proposed 0.2.1 release. But it still has bean running for over and hour. Giving me lots of time to think.  The profiling data claims that we are dominated by <code>choose_package_version</code>. Staring at my implementation, wondering if an allocation here or there was the problem, I realized that our API requires us to do a Selection sort on the <code>potential_packages</code>.  Before each decision we need to call <code>choose_package_version</code> to tell us witch to pick and each call needs to look at all <code>potential_packages</code>. So the "look at" code is called <code>~ (#decisions ^ 2) / 2</code> times. I'd like to consider changing this for <code>0.3</code>. What are your thoughts?</p>



<a name="238153576"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/238153576" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Matthieu Pizenberg <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#238153576">(May 10 2021 at 14:42)</a>:</h4>
<p>I have not thought too much about that yet, and not sure what you mean by "changing this". In theory, we don't need to look at all potential packages, we could simply pick the first one. But as you explained in the guide, picking the one with the lowest amount of available versions is the cause of the "look at" each potential package but also provides a huge perf gain.</p>



<a name="238154360"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/238154360" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Matthieu Pizenberg <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#238154360">(May 10 2021 at 14:47)</a>:</h4>
<p>Once a new package is selected, all its dependencies that have not been selected yet become a "potential package". This also mean that maybe, a better heuristic would also take into account the number of dependencies a package has, not only the number of potential versions to take a decision. This would lower the number of <code>potential_packages</code> for all the following decisions. Of course, depending on a given version, the number of dependencies may change, but a simple heuristic could just consider the most likely version to be chosen (newest / oldest). Maybe the tradeoff between more computation for one decision (need to check the number of dependencies of each package) and reducing the number of accumulated "potential packages" during the search is worth it.</p>



<a name="238155203"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/238155203" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Matthieu Pizenberg <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#238155203">(May 10 2021 at 14:51)</a>:</h4>
<p>I'm not following where the "^2" is coming from. I'd say the "look at" code is called roughly <code>#decisions * mean#potential_packages</code></p>



<a name="238169729"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/238169729" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Eh2406 <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#238169729">(May 10 2021 at 16:17)</a>:</h4>
<p>Each <code>potential_package</code> will become a <code>decision</code> in some later iteration. So <code>mean#potential_packages</code> ~= <code>#decisions / 2</code>.<br>
As to "changing this" I do want to keep that it is up to the user. And I have not figured it all out yet. I am thinking about having a priority queue, so the user provides a method <code>prioritize&lt;Priority: Ord&gt;(package: P, range: Range&lt;V&gt;) -&gt; (Priority, Option&lt;V&gt;)</code>. When we add a <code>potential_package</code> or change the range for a package we call <code>prioritize</code>, and put it in the queue. When we need to make a decision, we pop from the queue.</p>



<a name="238939790"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/238939790" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Eh2406 <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#238939790">(May 16 2021 at 01:40)</a>:</h4>
<p>As a proof of concept <a href="https://github.com/pubgrub-rs/pubgrub/tree/priority-queue">https://github.com/pubgrub-rs/pubgrub/tree/priority-queue</a><br>
Looks like (as implemented) it scales better, but does more Hashes of P.<br>
So <code>elm</code> is ~5% slower as the cases are simple and str hash is not free.<br>
<code>large_case</code> is unchanged as it is simple and <code>u16</code> hash is basically free.<br>
<code>zuse</code> is ~21% faster as it has more complicated cases.</p>



<a name="239644355"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/239644355" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Eh2406 <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#239644355">(May 20 2021 at 19:13)</a>:</h4>
<p>I built a stand alone file based on the <code>hobo</code> crate. When I tried to benchmark it with the <code>0.2.1</code> candidate I got:</p>
<blockquote>
<p>Benchmarking large_cases/hobo_str_SemanticVersion.ron: Collecting 100 samples in estimated 8161.2 s</p>
</blockquote>
<p>when I ran with this branch I got:</p>
<blockquote>
<p>Benchmarking large_cases/hobo_str_SemanticVersion.ron: Collecting 100 samples in estimated 712.05 s</p>
</blockquote>
<p>So there exist real inputs for witch this is a ~91% improvement.</p>



<a name="239773699"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/239773699" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Matthieu Pizenberg <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#239773699">(May 21 2021 at 15:59)</a>:</h4>
<blockquote>
<p>8161.2s</p>
</blockquote>
<p>That must have been quite painful! ^^</p>



<a name="239774334"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/239774334" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Matthieu Pizenberg <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#239774334">(May 21 2021 at 16:03)</a>:</h4>
<p>If you have time, could you also try the rather simple change of heuristic consisting in choosing the next package that has the lowest <code>number_of_versions + number_of_new_dependencies</code> or <code>number_of_versions * (1+number_of_new_dependencies)</code>?</p>



<a name="239774688"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/239774688" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Matthieu Pizenberg <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#239774688">(May 21 2021 at 16:05)</a>:</h4>
<p>I have a feeling that this simple heuristic change could result in dramatic improvements of worst case scenarios because it has the following effects:</p>
<ul>
<li>lower number of incompatibilities</li>
<li>lower number of potential packages</li>
<li>lower number of packages in the incompatibility buffer while doing unit propagation</li>
</ul>



<a name="239775439"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/239775439" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Matthieu Pizenberg <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#239775439">(May 21 2021 at 16:09)</a>:</h4>
<blockquote>
<p>number_of_versions + number_of_dependencies</p>
</blockquote>
<p>where <code>number_of_new_dependencies</code> here would be the number of new dependencies of the version that would be chosen if that package was picked. This implies that we would most likely want to cache that information (dependencies of that package at this version) to avoid re-fetching it when that package is eventually picked.</p>



<a name="239799912"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/239799912" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Eh2406 <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#239799912">(May 21 2021 at 19:16)</a>:</h4>
<blockquote>
<p>8161.2s</p>
</blockquote>
<p>To be clear I did not actually let the benchmark run for ether of them. I thought the runtime estimate was more than enough to demonstrate the point. :-P</p>



<a name="239800367"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/239800367" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Eh2406 <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#239800367">(May 21 2021 at 19:20)</a>:</h4>
<p>I will be happy to try different heuristics! I would love to find a better one! But even if it is better, I don't think it will change the improvements from having to evaluate the heuristic less often.</p>



<a name="239800602"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/260232-t-cargo/PubGrub/topic/0.2.x%20api%20is%20a%20Selection%20sort/near/239800602" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Matthieu Pizenberg <a href="https://rust-lang.github.io/zulip_archive/stream/260232-t-cargo/PubGrub/topic/0.2E2.2Ex.20api.20is.20a.20Selection.20sort.html#239800602">(May 21 2021 at 19:21)</a>:</h4>
<p>Indeed those are two orthogonal things</p>



<hr><p>Last updated: Aug 07 2021 at 22:04 UTC</p>
</html>